Enhancing Keyphrase Extraction from Academic Articles Using Section Structure Information
Zhang, Chengzhi, Yan, Xinyi, Zhao, Lei, Zhang, Yingyi
The exponential growth in academic papers has significantly increased the time researchers need to access relevant literature. Keyphrase Extraction (KPE) offers a solution by enabling researchers to retrieve relevant literature efficiently. Current studies on KPE from academic articles aim to improve the performance of extraction models through innovative approaches that use the title and abstract as input corpora. However, the semantic richness of keyphrases is significantly constrained by the length of the abstract. Full-text-based KPE can address this limitation, but it simultaneously introduces noise, which significantly diminishes KPE performance. To balance these concerns, this paper utilizes the structural features and section texts obtained from the section structure information of academic articles to extract keyphrases from academic papers. The approach consists of two main parts: (1) exploring the effect of seven structural features on KPE models, and (2) integrating the extraction results from all section texts used as input corpora via a keyphrase integration algorithm. Furthermore, this paper examines the effect of the classification quality of section structure on KPE performance. The results show that incorporating structural features improves KPE performance, though different features affect model efficacy to varying degrees. The keyphrase integration approach yields the best performance, and the classification quality of section structure affects KPE performance. These findings indicate that using the section structure information of academic articles contributes to effective KPE. The code and dataset supporting this study are available at https://github.com/yan-xinyi/SSB_KPE.
- Research Report > New Finding (1.00)
- Overview (1.00)
- Research Report > Experimental Study (0.93)
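The abstract above mentions a keyphrase integration algorithm without specifying it. As a minimal sketch, assuming a simple score-summing fusion (the function name, the per-section score format, and the toy section data are all illustrative, not the authors' method):

```python
from collections import defaultdict

def integrate_keyphrases(section_results, top_k=5):
    """Merge per-section keyphrase rankings into one list.

    section_results maps a section name to a list of
    (keyphrase, score) pairs produced by a KPE model run on
    that section's text. Scores are summed across sections,
    so phrases extracted from several sections rank higher.
    """
    totals = defaultdict(float)
    for pairs in section_results.values():
        for phrase, score in pairs:
            totals[phrase.lower()] += score
    ranked = sorted(totals.items(), key=lambda kv: -kv[1])
    return [phrase for phrase, _ in ranked[:top_k]]

sections = {
    "Introduction": [("keyphrase extraction", 0.9), ("abstract", 0.4)],
    "Methods": [("keyphrase extraction", 0.8), ("section structure", 0.7)],
}
print(integrate_keyphrases(sections, top_k=2))
```

Other fusion rules (max-pooling, rank-based voting) slot into the same interface by changing only the aggregation step.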
Public interest in science or bots? Selective amplification of scientific articles on Twitter
Rahman, Ashiqur, Mohammadi, Ehsan, Alhoori, Hamed
With its remarkable capability to reach the public instantly, social media has become integral to sharing scholarly articles and measuring public response. Because spamming by bots on social media can steer the conversation and manufacture false public interest in a given piece of research, potentially affecting policies that shape people's lives in the real world, this topic warrants critical study and attention. We used the Altmetric dataset in combination with data collected through the Twitter Application Programming Interface (API) and the Botometer API. We combined these sources into an extensive dataset of academic articles, several article-level features, and a label indicating whether each article had excessive bot activity on Twitter. We analyzed the data to assess the likelihood of bot activity based on different characteristics of an article, and we trained machine-learning models on this dataset to identify possible bot activity in any given article. Our machine-learning models identified possible bot activity in an academic article with an accuracy of 0.70. We also found that articles related to "Health and Human Science" are more prone to bot activity than articles from other research areas. Without arguing the maliciousness of the bot activity, our work presents a tool for identifying the presence of bot activity in the dissemination of an academic article and creates a baseline for future research in this direction.
- North America > United States > South Carolina > Richland County > Columbia (0.14)
- North America > United States > Illinois > DeKalb County > DeKalb (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (14 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology > Services (1.00)
- Information Technology > Security & Privacy (0.70)
- Media (0.67)
- Health & Medicine > Therapeutic Area > Immunology (0.47)
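The abstract above does not name the model family behind the 0.70 accuracy figure. As a minimal stand-in, assuming a linear classifier over numeric article features (the feature names and toy values below are hypothetical, not from the Altmetric/Botometer dataset):

```python
def train_perceptron(rows, labels, epochs=20, lr=0.1):
    """Train a single-layer perceptron on numeric feature rows.

    rows: feature vectors per article (e.g. bot-follower ratio,
    tweet burstiness, both scaled to [0, 1]); labels: 1 means
    excessive bot activity, 0 means organic sharing.
    """
    w = [0.0] * len(rows[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = y - pred
            if err:  # perceptron update only on mistakes
                w = [wi + lr * err * xi for wi, xi in zip(w, x)]
                b += lr * err
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# toy rows: [bot_follower_ratio, tweet_burstiness]
X = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
y = [1, 1, 0, 0]
w, b = train_perceptron(X, y)
print([predict(w, b, x) for x in X])
```

Any classifier with a `fit`/`predict` shape can replace the perceptron; the labeled-dataset framing is the part the abstract fixes.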
The Future of Scientific Publishing: Automated Article Generation
This study introduces a novel software tool that leverages large language model (LLM) prompts to automate the generation of academic articles from Python code, a significant advance in the fields of biomedical informatics and computer science. Selected for its widespread adoption and analytical versatility, Python served as a foundational proof of concept; however, the underlying methodology and framework are adaptable across a variety of GitHub repositories, underlining the tool's broad applicability (Harper 2024). By mitigating the traditionally time-intensive academic writing process, particularly the synthesis of complex datasets and coding outputs, this approach represents a major step towards streamlining research dissemination. The development was achieved without reliance on advanced language-model agents, ensuring high fidelity in the automated generation of coherent and comprehensive academic content. This exploration not only validates the successful application and efficiency of the software but also projects how future integration of LLM agents could amplify its capabilities, propelling the field towards a future where scientific findings are disseminated more swiftly and accessibly.
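The core mechanism described, turning source code into an LLM prompt for article generation, can be sketched as a prompt-assembly step. The wording, the function name, and the `repo_name` default below are illustrative; the paper's actual prompts are not reproduced here, and no LLM is called:

```python
def build_article_prompt(source_code, repo_name="example/repo"):
    """Assemble an LLM prompt asking for a draft article section
    that describes the given Python code. This is a hypothetical
    prompt template, not the one used in the study."""
    return (
        "You are drafting the Methods section of an academic article.\n"
        f"Repository: {repo_name}\n"
        "Describe, in formal academic prose, what the following "
        "Python code does, including its inputs and outputs.\n"
        "--- begin code ---\n"
        f"{source_code}\n"
        "--- end code ---"
    )

prompt = build_article_prompt("def mean(xs):\n    return sum(xs) / len(xs)")
print(prompt)
```

The returned string would then be sent to any chat-completion endpoint; keeping prompt assembly as a pure function makes the pipeline testable without model access.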
A Framework For Refining Text Classification and Object Recognition from Academic Articles
Li, Jinghong, Ota, Koichi, Gu, Wen, Hasegawa, Shinobu
With the widespread use of the internet, it has become increasingly crucial to extract specific information efficiently from vast numbers of academic articles. Data mining techniques are generally employed to solve this problem. However, data mining for academic articles is challenging because it requires automatically extracting specific patterns from documents with complex and unstructured layouts. Current data mining methods for academic articles employ rule-based (RB) or machine learning (ML) approaches. Rule-based methods incur a high coding cost for articles with complex typesetting, while purely machine-learning methods require costly annotation work for the complex content types within a paper. Furthermore, relying only on machine learning can lead to cases where patterns easily recognized by rule-based methods are mistakenly extracted. To overcome these issues, from the perspective of analyzing the standard layout and typesetting used in a specified publication, we emphasize implementing specific methods for specific characteristics of academic articles. We have developed a novel Text Block Refinement Framework (TBRF), a hybrid of machine learning and rule-based schemes. We used the well-known ACL proceedings articles as experimental data for the validation experiment. The experiment shows that our approach achieved over 95% classification accuracy and 90% detection accuracy for tables and figures.
- Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.48)
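A rule-then-ML hybrid of the kind the abstract describes can be sketched as a short dispatch function. The regexes, labels, and the `ml_model` interface below are assumptions for illustration, not the TBRF implementation:

```python
import re

def classify_block(text, ml_model=None):
    """Hybrid block classifier: cheap rules catch patterns that
    rules recognize reliably (table/figure captions), and
    everything else is deferred to a learned model. `ml_model`
    is a stand-in for any object with a `predict(text)` method;
    the final fallback label is a placeholder."""
    if re.match(r"\s*Table\s+\d+", text):
        return "table-caption"
    if re.match(r"\s*Figure\s+\d+", text):
        return "figure-caption"
    if ml_model is not None:
        return ml_model.predict(text)
    return "body-text"

print(classify_block("Table 3: F1 scores on ACL articles"))
print(classify_block("We validate the framework on ACL proceedings."))
```

Routing the easy cases through rules also prevents the failure mode the abstract notes, where an ML model mislabels patterns that rules handle trivially.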
AI Literature Review Suite
The process of conducting literature reviews is often time-consuming and labor-intensive. To streamline this process, I present an AI Literature Review Suite that integrates several functionalities to provide a comprehensive literature review. This tool leverages the power of open access science, large language models (LLMs) and natural language processing to enable the searching, downloading, and organizing of PDF files, as well as extracting content from articles. Semantic search queries are used for data retrieval, while text embeddings and summarization using LLMs present succinct literature reviews. Interaction with PDFs is enhanced through a user-friendly graphical user interface (GUI). The suite also features integrated programs for bibliographic organization, interaction and query, and literature review summaries. This tool presents a robust solution to automate and optimize the process of literature review in academic and industrial research.
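The semantic search step the abstract mentions reduces to ranking documents by embedding similarity. A minimal sketch, assuming cosine similarity over precomputed vectors (the toy 3-d vectors and file names are stand-ins; real embeddings would come from an LLM embedding model):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def semantic_search(query_vec, doc_vecs, top_k=2):
    """Return the top_k document names ranked by cosine
    similarity between the query embedding and each document
    embedding."""
    ranked = sorted(doc_vecs, key=lambda d: -cosine(query_vec, doc_vecs[d]))
    return ranked[:top_k]

library = {
    "paper_on_llms.pdf":    [0.9, 0.1, 0.0],
    "paper_on_crispr.pdf":  [0.0, 0.2, 0.9],
    "paper_on_prompts.pdf": [0.8, 0.3, 0.1],
}
print(semantic_search([1.0, 0.2, 0.0], library, top_k=2))
```

The retrieved top-k chunks would then be passed to an LLM summarizer to produce the review text.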
Extracting Blockchain Concepts from Text
Veiga, Rodrigo, Endler, Markus, de Paiva, Valeria
Blockchains provide a mechanism through which mutually distrustful remote parties can reach consensus on the state of a ledger of information. As this space develops at great speed, the demand from those seeking to learn about blockchain grows with it. Being a technical subject, it can be quite intimidating to start learning. For this reason, the main objective of this project was to apply machine learning models that extract information from whitepapers and academic articles focused on the blockchain area, in order to organize this information and help users navigate the space.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
- (3 more...)
Automatic Recognition and Classification of Future Work Sentences from Academic Articles in a Specific Domain
Zhang, Chengzhi, Xiang, Yi, Hao, Wenke, Li, Zhicheng, Qian, Yuchen, Wang, Yuzhuo
Future work sentences (FWS) are the particular sentences in academic papers in which authors describe their proposed follow-up research directions. This paper presents methods to automatically extract FWS from academic papers and classify them according to the different future directions embodied in the paper's content. FWS recognition methods will enable subsequent researchers to locate future work sentences more accurately and quickly, and will reduce the time and cost of acquiring such a corpus. Existing work on the automatic identification of future work sentences is relatively scarce, and current methods cannot accurately identify FWS in academic papers, which prevents large-scale data mining. Furthermore, future work covers many aspects, and subdividing its content facilitates the analysis of specific development directions. In this paper, Natural Language Processing (NLP) is used as a case study: FWS are extracted from academic papers and classified into different types. We manually build an annotated corpus with six different types of FWS. Then, automatic recognition and classification of FWS are implemented using machine learning models, and the performance of these models is compared on the evaluation metrics. The results show that the Bernoulli Bayesian model performs best in the automatic recognition task, with a Macro F1 of 90.73%, and the SCIBERT model performs best in the automatic classification task, with a weighted average F1 of 72.63%. Finally, we extract keywords from FWS to gain a deeper understanding of their key content, and we demonstrate that the content described in FWS is reflected in subsequent research work by measuring the similarity between future work sentences and the abstracts.
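The Bernoulli Bayesian model named in the abstract scores both word presence and word absence per class. A from-scratch sketch on toy data (the example sentences and labels are invented for illustration; this is not the authors' corpus or code):

```python
import math

def train_bernoulli_nb(docs, labels):
    """Train Bernoulli naive Bayes over word-presence features
    with Laplace smoothing. Returns vocab, log-priors, and
    per-class P(word present | class)."""
    vocab = sorted({w for d in docs for w in d.split()})
    priors, cond = {}, {}
    for c in set(labels):
        class_docs = [set(d.split()) for d, y in zip(docs, labels) if y == c]
        priors[c] = math.log(len(class_docs) / len(docs))
        cond[c] = {w: (sum(w in d for d in class_docs) + 1) /
                      (len(class_docs) + 2) for w in vocab}
    return vocab, priors, cond

def predict_nb(doc, vocab, priors, cond):
    present = set(doc.split())
    best, best_lp = None, float("-inf")
    for c in priors:
        lp = priors[c]
        for w in vocab:  # Bernoulli NB also scores absent words
            p = cond[c][w]
            lp += math.log(p) if w in present else math.log(1 - p)
        if lp > best_lp:
            best, best_lp = c, lp
    return best

docs = [
    "in future work we will extend the model",
    "we plan to explore larger corpora in the future",
    "the results show our model outperforms baselines",
    "table 2 reports the evaluation results",
]
labels = ["fws", "fws", "other", "other"]
model = train_bernoulli_nb(docs, labels)
print(predict_nb("in the future we will explore", *model))
```

The recognition step is thus a binary FWS-vs-other decision; the paper's six-way type classification would reuse the same interface with six labels (or, as reported, a SCIBERT classifier).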
Which structure of academic articles do referees pay more attention to?: perspective of peer review and full-text of academic articles
Qin, Chenglei, Zhang, Chengzhi
Purpose The purpose of this paper is to explore which structures of academic articles referees pay more attention to, what specific content referees focus on, and whether the distribution of peer review comments (PRC) is related to citations. Design/methodology/approach Firstly, the feature words of section titles and a hierarchical attention network (HAN) model are used to identify academic article structures. Secondly, the distribution of PRC across structures is analyzed according to position information extracted from the PRC by rules. Thirdly, the distribution of PRC feature words, extracted by the chi-square test and TF-IDF, is analyzed across structures. Finally, four correlation analysis methods are used to test whether the distribution of PRC across structures is correlated with citations. Findings The count of PRC in the Materials and Methods and Results sections is significantly higher than in the Introduction and Discussion, indicating that referees pay more attention to Materials and Methods and Results. The distribution of PRC feature words differs markedly across structures, reflecting the content referees are concerned with. No correlation is found between the distribution of PRC across structures and citations. Research limitations/implications Because referees write peer review reports in different ways, the rules used to extract position information cannot cover all PRC. Originality/value The paper identifies a pattern in the distribution of PRC across academic article structures, confirming a long-standing empirical understanding. It also offers insight into academic writing: researchers should ensure the scientificity of their methods and the reliability of their results to obtain a high degree of recognition from referees.
- Asia > China > Jiangsu Province > Nanjing (0.04)
- Asia > China > Beijing > Beijing (0.04)
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
- (8 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study > Negative Result (0.48)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Information Management (0.93)
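The TF-IDF half of the feature-word extraction described in the last abstract above can be sketched in a few lines (the chi-square half is omitted, and the toy token lists are invented; structure names follow the abstract's sections):

```python
import math
from collections import Counter

def tfidf_feature_words(section_words, top_k=1):
    """Score candidate feature words per article structure by
    TF-IDF: term frequency within the structure's comment text
    times log inverse document frequency across structures.
    section_words maps a structure name to its token list."""
    n = len(section_words)
    df = Counter()
    for words in section_words.values():
        df.update(set(words))  # document frequency per word
    top = {}
    for name, words in section_words.items():
        tf = Counter(words)
        scores = {w: (tf[w] / len(words)) * math.log(n / df[w]) for w in tf}
        ranked = sorted(scores, key=lambda w: -scores[w])
        top[name] = ranked[:top_k]
    return top

comments = {
    "Methods": "sample sample protocol the".split(),
    "Results": "significant significant effect the".split(),
}
print(tfidf_feature_words(comments))
```

Words shared by every structure (here "the") get an IDF of zero and drop out, which is exactly why TF-IDF surfaces structure-specific vocabulary.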